Pool-Based Sequential Active Learning for Regression
Active learning is a machine learning approach for reducing the data labeling
effort. Given a pool of unlabeled samples, it tries to select the most useful
ones to label so that a model built from them can achieve the best possible
performance. This paper focuses on pool-based sequential active learning for
regression (ALR). We first propose three essential criteria that an ALR
approach should consider in selecting the most useful unlabeled samples:
informativeness, representativeness, and diversity, and compare four existing
ALR approaches against them. We then propose a new ALR approach using passive
sampling, which considers both the representativeness and the diversity in both
the initialization and subsequent iterations. Remarkably, this approach can
also be integrated with other existing ALR approaches in the literature to
further improve the performance. Extensive experiments on 11 UCI, CMU StatLib,
and UFL Media Core datasets from various domains verified the effectiveness of
our proposed ALR approaches.
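As a rough illustration of unsupervised selection that balances representativeness and diversity (a sketch, not the paper's exact passive-sampling algorithm; function names, the Euclidean distance, and the centroid-first rule are assumptions), one can pick the sample closest to the pool centroid and then greedily add the samples farthest from those already selected:

    import numpy as np

    def select_initial_samples(X_pool, k):
        """Greedy representativeness + diversity selection (illustrative sketch).

        X_pool: (n, d) array of unlabeled samples; k: number of samples to label.
        """
        # Representativeness: start with the sample closest to the pool centroid.
        centroid = X_pool.mean(axis=0)
        selected = [int(np.argmin(np.linalg.norm(X_pool - centroid, axis=1)))]
        # Diversity: repeatedly add the sample farthest from the current selection.
        for _ in range(k - 1):
            dists = np.linalg.norm(
                X_pool[:, None, :] - X_pool[selected][None, :, :], axis=2
            ).min(axis=1)                      # distance to the nearest selected sample
            dists[selected] = -np.inf          # never re-select a chosen sample
            selected.append(int(np.argmax(dists)))
        return selected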
On the Vulnerability of CNN Classifiers in EEG-Based BCIs
Deep learning has been successfully used in numerous applications because of
its outstanding performance and the ability to avoid manual feature
engineering. One such application is electroencephalogram (EEG) based
brain-computer interface (BCI), where multiple convolutional neural network
(CNN) models have been proposed for EEG classification. However, it has been
found that deep learning models can be easily fooled with adversarial examples,
which are normal examples with small deliberate perturbations. This paper
proposes an unsupervised fast gradient sign method (UFGSM) to attack three
popular CNN classifiers in BCIs, and demonstrates its effectiveness. We also
verify the transferability of adversarial examples in BCIs, which means we can
perform attacks even without knowing the architecture and parameters of the
target models, or the datasets they were trained on. To our knowledge, this is
the first study on the vulnerability of CNN classifiers in EEG-based BCIs, and
hopefully will draw more attention to the security of BCI systems.
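For reference, a minimal PyTorch sketch of the standard one-step FGSM attack that the proposed method builds on; the unsupervised twist hedged below (using the model's own prediction when the true label is unavailable) is one plausible reading, not necessarily the paper's exact UFGSM, and epsilon is an illustrative value:

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, x, epsilon=0.01, y=None):
        """One-step FGSM: perturb x along the sign of the loss gradient.

        If y is None, the model's own prediction is used as the label
        (a plausible unsupervised variant; not necessarily the paper's UFGSM).
        """
        x_adv = x.clone().detach().requires_grad_(True)
        logits = model(x_adv)
        if y is None:
            y = logits.argmax(dim=1).detach()
        loss = F.cross_entropy(logits, y)
        loss.backward()
        return (x_adv + epsilon * x_adv.grad.sign()).detach()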
Different Set Domain Adaptation for Brain-Computer Interfaces: A Label Alignment Approach
A brain-computer interface (BCI) system usually needs a long calibration
session for each new subject/task to adjust its parameters, which impedes its
transition from the laboratory to real-world applications. Domain adaptation,
which leverages labeled data from auxiliary subjects/tasks (source domains),
has demonstrated its effectiveness in reducing such calibration effort.
Currently, most domain adaptation approaches require the source domains to have
the same feature space and label space as the target domain, which limits their
applications, as the auxiliary data may have different feature spaces and/or
different label spaces. This paper considers different set domain adaptation
for BCIs, i.e., the source and target domains have different label spaces. We
introduce a practical setting of different label sets for BCIs, and propose a
novel label alignment (LA) approach to align the source label space with the
target label space. It has three desirable properties: 1) LA only needs as few
as one labeled sample from each class of the target subject; 2) LA can be used
as a preprocessing step before different feature extraction and classification
algorithms; and 3) LA can be integrated with other domain adaptation
approaches to achieve even better performance. Experiments on two motor imagery
datasets demonstrated the effectiveness of LA.
Comment: IEEE Trans. on Neural Systems and Rehabilitation Engineering, 202
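As a rough illustration of class-wise alignment (an assumption for illustration only, not necessarily the paper's exact LA transform), each source class can be whitened with its own covariance and re-colored with the covariance of the matching target class estimated from the few labeled target samples:

    import numpy as np

    def _sym_power(C, p):
        """Matrix power of a symmetric PSD matrix via eigendecomposition."""
        w, V = np.linalg.eigh(C)
        return (V * np.maximum(w, 1e-12) ** p) @ V.T

    def align_source_class(Xs, cov_target):
        """Class-wise covariance alignment (illustrative sketch).

        Xs: (n, d) source samples of one class; cov_target: (d, d) covariance of
        the matching target class, estimated from the few labeled target samples.
        """
        Xs_c = Xs - Xs.mean(axis=0)
        cov_source = np.cov(Xs_c, rowvar=False)
        # Whiten with the source class covariance, then re-color with the target's.
        return Xs_c @ _sym_power(cov_source, -0.5) @ _sym_power(cov_target, 0.5)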
Empirical Studies on the Properties of Linear Regions in Deep Neural Networks
A deep neural network (DNN) with piecewise linear activations can partition
the input space into numerous small linear regions, where different linear
functions are fitted. It is believed that the number of these regions
represents the expressivity of the DNN. This paper provides a novel and
meticulous perspective to look into DNNs: Instead of just counting the number
of the linear regions, we study their local properties, such as the inspheres,
the directions of the corresponding hyperplanes, the decision boundaries, and
the relevance of the surrounding regions. We empirically observed that
different optimization techniques lead to completely different linear regions,
even though they result in similar classification accuracies. We hope our study
can inspire the design of novel optimization techniques, and help discover and
analyze the behaviors of DNNs.
Comment: Int'l. Conf. on Learning Representations (ICLR), Addis Ababa, Ethiopia, April 202
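To illustrate how linear regions can be probed empirically (a sketch under assumed layer sizes, not the paper's measurement pipeline): each input's ReLU on/off pattern identifies the linear region it lies in, so counting distinct patterns over sampled points gives a lower bound on the number of regions:

    import torch

    def activation_pattern(layers, x):
        """Return the ReLU on/off pattern identifying x's linear region."""
        pattern, h = [], x
        for layer in layers:
            h = layer(h)
            if isinstance(layer, torch.nn.ReLU):
                pattern.append((h > 0).flatten())
        return torch.cat(pattern)

    # Count distinct regions hit by random probe points (a lower bound).
    layers = [torch.nn.Linear(2, 16), torch.nn.ReLU(),
              torch.nn.Linear(16, 16), torch.nn.ReLU(),
              torch.nn.Linear(16, 3)]
    probes = torch.rand(5000, 2)
    patterns = {tuple(activation_pattern(layers, p.unsqueeze(0)).tolist()) for p in probes}
    print(len(patterns), "distinct linear regions found among the probes")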
Active Stacking for Heart Rate Estimation
Heart rate estimation from electrocardiogram signals is very important for
the early detection of cardiovascular diseases. However, due to large
individual differences and varying electrocardiogram signal quality, there does
not exist a single reliable estimation algorithm that works well on all
subjects. Every algorithm may break down on certain subjects, resulting in a
significant estimation error. Ensemble regression, which aggregates the outputs
of multiple base estimators for more reliable and stable estimates, can be used
to remedy this problem. Moreover, active learning can be used to optimally
select a few trials from a new subject to label, based on which a stacking
ensemble regression model can be trained to aggregate the base estimators. This
paper proposes four active stacking approaches, and demonstrates that they all
significantly outperform three common unsupervised ensemble regression
approaches, and a supervised stacking approach which randomly selects some
trials to label. Remarkably, our active stacking approaches only need three or
four labeled trials from each subject to achieve an average root mean squared
estimation error below three beats per minute, making them very convenient for
real-world applications. To our knowledge, this is the first research on active
stacking, and its application to heart rate estimation.
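A minimal sketch of the stacking step, assuming the few labeled trials per subject have already been chosen; the ridge meta-learner and the disagreement-based selection heuristic are illustrative assumptions, not the paper's exact active stacking procedures:

    import numpy as np
    from sklearn.linear_model import Ridge

    def stack_heart_rate(base_preds, labeled_idx, true_hr):
        """Aggregate base estimator outputs with a stacking meta-regressor.

        base_preds: (n_trials, n_estimators) heart-rate estimates from the base algorithms;
        labeled_idx: indices of the few labeled trials for this subject;
        true_hr: ground-truth heart rates for those trials.
        """
        meta = Ridge(alpha=1.0).fit(base_preds[labeled_idx], true_hr)
        return meta.predict(base_preds)          # fused estimates for all trials

    def pick_trials_to_label(base_preds, k=3):
        """One plausible 'active' heuristic: label trials where base estimators disagree most."""
        disagreement = base_preds.std(axis=1)
        return np.argsort(disagreement)[-k:]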
Optimize TSK Fuzzy Systems for Regression Problems: Mini-Batch Gradient Descent with Regularization, DropRule and AdaBound (MBGD-RDA)
Takagi-Sugeno-Kang (TSK) fuzzy systems are very useful machine learning
models for regression problems. However, to our knowledge, no efficient and
effective training algorithm yet exists that ensures their generalization
performance and also enables them to handle big data.
Inspired by the connections between TSK fuzzy systems and neural networks, we
extend three powerful neural network optimization techniques, i.e., mini-batch
gradient descent, regularization, and AdaBound, to TSK fuzzy systems, and also
propose three novel techniques (DropRule, DropMF, and DropMembership)
specifically for training TSK fuzzy systems. Our final algorithm, mini-batch
gradient descent with regularization, DropRule and AdaBound (MBGD-RDA), can
achieve fast convergence in training TSK fuzzy systems, and also superior
generalization performance in testing. It can be used for training TSK fuzzy
systems on datasets of any size; however, it is particularly useful for big
datasets, on which no other efficient training algorithms currently exist.
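A minimal PyTorch sketch of a TSK model with Gaussian MFs trained by mini-batch gradient descent with L2 regularization and rule dropout; AdaBound is swapped for Adam here because AdaBound is not in core PyTorch, and all sizes and hyperparameters are illustrative assumptions rather than the paper's settings:

    import torch

    class TSK(torch.nn.Module):
        """TSK fuzzy system with Gaussian MFs and first-order consequents."""
        def __init__(self, d, n_rules):
            super().__init__()
            self.centers = torch.nn.Parameter(torch.randn(n_rules, d))
            self.log_sigmas = torch.nn.Parameter(torch.zeros(n_rules, d))
            self.consequents = torch.nn.Parameter(torch.zeros(n_rules, d + 1))

        def forward(self, x, drop_rule=0.0):
            diff = (x.unsqueeze(1) - self.centers) / self.log_sigmas.exp()
            firing = torch.softmax(-0.5 * (diff ** 2).sum(-1), dim=1)   # (N, R) normalized firing
            if self.training and drop_rule > 0:                          # DropRule: randomly mask rules
                mask = (torch.rand(firing.shape[1]) > drop_rule).float()
                firing = firing * mask
                firing = firing / firing.sum(1, keepdim=True).clamp_min(1e-12)
            x1 = torch.cat([x, torch.ones(x.shape[0], 1)], dim=1)        # bias term for consequents
            return (firing * (x1 @ self.consequents.t())).sum(dim=1)

    model = TSK(d=4, n_rules=10)
    opt = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=1e-4)   # L2 regularization
    X, y = torch.randn(256, 4), torch.randn(256)                             # toy data
    for epoch in range(100):
        for i in range(0, len(X), 64):                                       # mini-batches
            xb, yb = X[i:i + 64], y[i:i + 64]
            loss = torch.nn.functional.mse_loss(model(xb, drop_rule=0.5), yb)
            opt.zero_grad(); loss.backward(); opt.step()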
MBGD-RDA Training and Rule Pruning for Concise TSK Fuzzy Regression Models
To effectively train Takagi-Sugeno-Kang (TSK) fuzzy systems for regression
problems, a Mini-Batch Gradient Descent with Regularization, DropRule, and
AdaBound (MBGD-RDA) algorithm was recently proposed. It has demonstrated
superior performance; however, it also has some limitations, e.g., it does
not allow the user to specify the number of rules directly, and only Gaussian
MFs can be used. This paper proposes two variants of MBGD-RDA to remedy these
limitations, and shows that they outperform the original MBGD-RDA and the
classical ANFIS algorithms with the same number of rules. Furthermore, we also
propose a rule pruning algorithm for TSK fuzzy systems, which can reduce the
number of rules without significantly sacrificing the regression performance.
Experiments showed that the rules obtained from pruning are generally better
than those trained from scratch, especially when Gaussian MFs are used.
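One plausible pruning criterion (an assumption for illustration, not necessarily the paper's rule-pruning algorithm) is to drop the rules whose average normalized firing strength on the training data is lowest, and then fine-tune the remaining rules:

    import numpy as np

    def prune_rules(firing, keep):
        """Keep the `keep` rules with the highest average normalized firing strength.

        firing: (n_samples, n_rules) normalized firing levels on the training data.
        Returns the indices of rules to keep; the pruned model is then fine-tuned.
        """
        importance = firing.mean(axis=0)
        return np.argsort(importance)[-keep:]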
Active Semi-supervised Transfer Learning (ASTL) for Offline BCI Calibration
Single-trial classification of event-related potentials in
electroencephalogram (EEG) signals is a very important paradigm of
brain-computer interface (BCI). Because of individual differences, usually some
subject-specific calibration data are required to tailor the classifier for
each subject. Transfer learning has been extensively used to reduce such
calibration data requirement, by making use of auxiliary data from
similar/relevant subjects/tasks. However, all previous research assumes that
all auxiliary data have been labeled. This paper considers a more general
scenario, in which part of the auxiliary data could be unlabeled. We propose
active semi-supervised transfer learning (ASTL) for offline BCI calibration,
which integrates active learning, semi-supervised learning, and transfer
learning. Using a visual evoked potential oddball task and three different EEG
headsets, we demonstrate that ASTL can achieve consistently good performance
across subjects and headsets, and it outperforms some state-of-the-art
approaches in the literature.
Comment: IEEE Int'l. Conf. on Systems, Man and Cybernetics, Banff, Canada, 2017. arXiv admin note: substantial text overlap with arXiv:1702.0289
Transfer Learning Enhanced Common Spatial Pattern Filtering for Brain Computer Interfaces (BCIs): Overview and a New Approach
The electroencephalogram (EEG) is the most widely used input for brain
computer interfaces (BCIs), and common spatial pattern (CSP) is frequently used
to spatially filter it to increase its signal-to-noise ratio. However, CSP is a
supervised filter, which needs some subject-specific calibration data to
design. This is time-consuming and not user-friendly. A promising approach for
shortening or even completely eliminating this calibration session is transfer
learning, which leverages relevant data or knowledge from other subjects or
tasks. This paper reviews three existing approaches for incorporating transfer
learning into CSP, and also proposes a new transfer learning enhanced CSP
approach. Experiments on motor imagery classification demonstrate their
effectiveness. In particular, our proposed approach achieves the best
performance when the number of target-domain calibration samples is small.
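For reference, a minimal sketch of the standard (non-transfer) CSP filter computation via a generalized eigendecomposition of the two class-average covariance matrices; variable names and the number of retained filters are assumptions:

    import numpy as np
    from scipy.linalg import eigh

    def csp_filters(trials_a, trials_b, n_filters=6):
        """Compute CSP spatial filters from two classes of EEG trials.

        trials_a, trials_b: lists of (channels, samples) arrays, one per trial.
        Returns a (channels, n_filters) matrix of spatial filters.
        """
        def mean_cov(trials):
            return np.mean([np.cov(t) for t in trials], axis=0)
        Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
        # Generalized eigenproblem Ca w = lambda (Ca + Cb) w; the extreme eigenvalues
        # give filters maximizing variance for one class while minimizing it for the other.
        w, V = eigh(Ca, Ca + Cb)
        order = np.argsort(w)
        picks = np.concatenate([order[:n_filters // 2], order[-(n_filters - n_filters // 2):]])
        return V[:, picks]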
Unsupervised Pool-Based Active Learning for Linear Regression
In many real-world machine learning applications, unlabeled data can be
easily obtained, but it is very time-consuming and/or expensive to label them.
So, it is desirable to be able to select the optimal samples to label, so that
a good machine learning model can be trained from a minimum amount of labeled
data. Active learning (AL) has been widely used for this purpose. However, most
existing AL approaches are supervised: they train an initial model from a small
amount of labeled samples, query new samples based on the model, and then
update the model iteratively. Few of them have considered the completely
unsupervised AL problem, i.e., starting from zero, how to optimally select the
very first few samples to label, without knowing any label information at all.
This problem is very challenging, as no label information can be utilized. This
paper studies unsupervised pool-based AL for linear regression problems. We
propose a novel AL approach that considers simultaneously the informativeness,
representativeness, and diversity, three essential criteria in AL. Extensive
experiments on 14 datasets from various application domains, using three
different linear regression models (ridge regression, LASSO, and linear support
vector regression), demonstrated the effectiveness of our proposed approach.
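One classical way to instantiate label-free informativeness for linear regression (an illustration only, not the paper's algorithm) is D-optimal design: greedily pick the sample that most increases the determinant of the selected design matrix:

    import numpy as np

    def greedy_d_optimal(X_pool, k, ridge=1e-6):
        """Greedy D-optimal selection: maximize log det of the selected design matrix.

        X_pool: (n, d) unlabeled pool; k: number of samples to label.
        """
        n, d = X_pool.shape
        A = ridge * np.eye(d)                      # regularized information matrix
        selected = []
        for _ in range(k):
            best, best_gain = None, -np.inf
            for i in range(n):
                if i in selected:
                    continue
                sign, logdet = np.linalg.slogdet(A + np.outer(X_pool[i], X_pool[i]))
                if logdet > best_gain:
                    best, best_gain = i, logdet
            selected.append(best)
            A += np.outer(X_pool[best], X_pool[best])
        return selected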